
Fix server rotation resource waste with unified sequential retry logic #37

Merged
Sunwuyuan merged 5 commits into main from copilot/fix-get-effective-server-url
Jan 19, 2026

Conversation

Contributor

Copilot AI commented Jan 19, 2026

All API requests were attempting full server rotation on every call, causing resource waste. This PR fixes the issue by implementing a clean, unified approach where all requests use the same sequential retry logic.

Changes

Unified server rotation

  • All API requests now use tryWithRotation directly for consistent behavior
  • Servers are tried sequentially from top to bottom on every request
  • On any error, the next server in the list is attempted
  • No optimization, no session tracking - simple and predictable

Updated API clients

  • kvServerProvider.js: All methods (loadData, saveData, loadKeys, loadNamespaceInfo, updateNamespaceInfo) → tryWithRotation
  • api.js: getNamespaceInfo → tryWithRotation
  • Connection testing: Uses tryWithRotation (unchanged)

Implementation

// All requests use the same unified approach
const response = await tryWithRotation(async (serverUrl) => {
  // Resolves to the axios response; the payload is in response.data
  return axios.get(`${serverUrl}/kv/${key}`);
});
// Always tries servers in order: server 1, server 2, etc.
// No distinction between primary and other servers
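
The rotation behavior described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual `tryWithRotation`: the `SERVERS` list, its example URLs, and the `servers` option are assumptions made so the example is self-contained.

```javascript
// Hypothetical server list; the real list comes from the project's configuration.
const SERVERS = ["https://kv-a.example.com", "https://kv-b.example.com"];

// Sketch of sequential rotation: try each server from top to bottom,
// return the first successful result, and rethrow the last error if all fail.
async function tryWithRotation(operation, { servers = SERVERS } = {}) {
  let lastError = new Error("No servers configured");
  for (const serverUrl of servers) {
    try {
      return await operation(serverUrl); // stop on first success
    } catch (error) {
      lastError = error; // any error moves on to the next server
    }
  }
  throw lastError;
}
```

With this shape, each server is contacted at most once per request, and a request only reaches the second server if the first one threw.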

Benefits

  • Simple and predictable - Same behavior for all requests
  • No duplicate requests - Each server tried only once per request
  • Easy to understand - No complex optimization logic
  • Unified codebase - Single pattern across all API methods
  • Resource efficient - Tries servers sequentially, stops on first success
Original prompt

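Fix the getEffectiveServerUrl code, which appears to waste resources: change it to access resources using the first URL directly and try the next one only when a network problem occurs. Fix the bug where, when testing the connection, an error on the first domain is thrown without trying the others. Check for other related code problems. Do not run npm lint.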



…network errors

Co-authored-by: Sunwuyuan <88357633+Sunwuyuan@users.noreply.github.com>

cloudflare-workers-and-pages bot commented Jan 19, 2026

Deploying classworks with Cloudflare Pages

Latest commit: 17f7b46
Status: ✅  Deploy successful!
Preview URL: https://d7ebca51.classworks.pages.dev
Branch Preview URL: https://copilot-fix-get-effective-se.classworks.pages.dev


Co-authored-by: Sunwuyuan <88357633+Sunwuyuan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix resource waste in getEffectiveServerUrl code" to "Optimize server rotation to use primary server first, rotate only on network errors" Jan 19, 2026
@Sunwuyuan Sunwuyuan marked this pull request as ready for review January 19, 2026 12:56
Copilot AI review requested due to automatic review settings January 19, 2026 12:56
Copilot AI requested a review from Sunwuyuan January 19, 2026 12:56

Copilot AI left a comment

Pull request overview

This PR optimizes the server rotation mechanism to reduce unnecessary requests by using a "primary server first" strategy. Instead of attempting rotation on every request, the system now uses the last successful server and only falls back to rotation when network errors occur.

Changes:

  • Introduced tryWithPrimaryServer() function that attempts the primary (last working) server first, rotating only on network errors
  • Added isNetworkError() helper to distinguish between network errors (requiring rotation) and application errors (404, 401, etc.)
  • Updated all API clients (kvServerProvider.js, api.js) to use the new optimized approach while preserving full rotation for connection testing
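
As a rough illustration of the "primary server first" strategy summarized above, the sketch below mirrors the reviewed revision as the overview describes it. It is not the PR's code: the `servers` list, its URLs, and the exact fallback loop are assumptions, and only the names `primaryServerUrl`, `isNetworkError`, and `tryWithPrimaryServer` come from the review.

```javascript
// Hypothetical server list and module-level state (assumed for this sketch).
const servers = ["https://kv-a.example.com", "https://kv-b.example.com"];
let primaryServerUrl = null; // last server that worked during this session

// Treat "no response" and 5xx responses as reasons to rotate.
function isNetworkError(error) {
  if (!error.response) return true;
  return error.response.status >= 500;
}

async function tryWithPrimaryServer(operation) {
  const primaryUrl = primaryServerUrl ?? servers[0];
  let lastError;
  try {
    return await operation(primaryUrl);
  } catch (error) {
    if (!isNetworkError(error)) throw error; // e.g. 404 or 401: do not rotate
    lastError = error;
  }
  // Full rotation from the top of the list. Note that this loop may contact
  // the primary server a second time, which the review flags further down.
  for (const serverUrl of servers) {
    try {
      const result = await operation(serverUrl);
      primaryServerUrl = serverUrl; // remember the server that worked
      return result;
    } catch (error) {
      lastError = error;
    }
  }
  throw lastError;
}
```

The final merged version drops this design entirely in favor of plain sequential rotation, so the sketch is only useful for following the review discussion.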

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File | Description
src/utils/serverRotation.js | Core optimization logic: added primaryServerUrl tracking, isNetworkError() detection, and tryWithPrimaryServer() function
src/utils/providers/kvServerProvider.js | Updated all methods (loadData, saveData, loadKeys, loadNamespaceInfo, updateNamespaceInfo) to use tryWithPrimaryServer
src/utils/api.js | Updated getNamespaceInfo to use tryWithPrimaryServer
src/utils/socketClient.js | Removed unused tryWithRotation import


Comment on lines +123 to +139
  // Network errors from axios typically have no response or specific error codes
  if (!error.response) {
    return true; // No response = network issue
  }

  // Server timeout or connection errors
  if (error.code === 'ECONNABORTED' || error.code === 'ETIMEDOUT' ||
      error.code === 'ENOTFOUND' || error.code === 'ECONNREFUSED') {
    return true;
  }

  // 5xx errors might indicate server issues worth retrying
  const status = error.response?.status;
  if (status >= 500) {
    return true;
  }

Copilot AI Jan 19, 2026

The isNetworkError function has a logical issue. When an error has a response (line 124-126 check fails), but has one of the error codes like 'ECONNABORTED' or 'ETIMEDOUT', the function will still check for these codes on line 129-131. However, these error codes are typically set when there is no response object. If error.response exists, these codes are unlikely to be present, making this second check unreachable or ineffective.

Consider restructuring the logic to check error codes before checking for response, or verify that axios sets both error.code and error.response simultaneously for timeout scenarios.

Suggested change
- // Network errors from axios typically have no response or specific error codes
- if (!error.response) {
-   return true; // No response = network issue
- }
- // Server timeout or connection errors
- if (error.code === 'ECONNABORTED' || error.code === 'ETIMEDOUT' ||
-     error.code === 'ENOTFOUND' || error.code === 'ECONNREFUSED') {
-   return true;
- }
- // 5xx errors might indicate server issues worth retrying
- const status = error.response?.status;
- if (status >= 500) {
-   return true;
- }
+ // Server timeout or connection errors (may or may not have a response)
+ if (
+   error.code === "ECONNABORTED" ||
+   error.code === "ETIMEDOUT" ||
+   error.code === "ENOTFOUND" ||
+   error.code === "ECONNREFUSED"
+ ) {
+   return true;
+ }
+ // Network errors from axios often have no response at all
+ if (!error.response) {
+   return true; // No response = network issue
+ }
+ // 5xx errors might indicate server issues worth retrying
+ const status = error.response?.status;
+ if (status >= 500) {
+   return true;
+ }

Comment on lines +14 to +15
// Track the current primary server (the one that's currently working)
let primaryServerUrl = null;

Copilot AI Jan 19, 2026

The module-level variable primaryServerUrl introduces potential race condition issues in concurrent scenarios. If multiple API requests fail and trigger rotation simultaneously, they could race to update primaryServerUrl, potentially leading to inconsistent state. Additionally, this state is never reset when servers become unavailable, which could cause the application to stick with a failed server even after configuration changes.

Consider using a more robust state management approach, such as storing this in a singleton with proper synchronization, or implementing a mechanism to periodically validate and reset the primary server.
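
One way to address the "never reset" part of this comment is to validate the remembered server against the current configuration each time it is read. The sketch below is only an illustration under that assumption: `resolvePrimaryServer`, `markServerWorking`, and `markServerFailed` are hypothetical helper names, not functions from the PR.

```javascript
// Module-level state, as in the reviewed code; the helpers are hypothetical.
let primaryServerUrl = null;

// Return the server to try first, dropping a remembered primary that is
// no longer part of the configured list (for example after a config change).
function resolvePrimaryServer(serverList) {
  if (primaryServerUrl !== null && !serverList.includes(primaryServerUrl)) {
    primaryServerUrl = null;
  }
  return primaryServerUrl ?? serverList[0] ?? null;
}

// Remember a server that just handled a request successfully.
function markServerWorking(serverUrl) {
  primaryServerUrl = serverUrl;
}

// Forget the primary if it is the server that just failed.
function markServerFailed(serverUrl) {
  if (primaryServerUrl === serverUrl) {
    primaryServerUrl = null;
  }
}
```

Because these helpers are synchronous, concurrent requests cannot interleave inside a single update. The last request to finish still decides the stored value, so this addresses stale state rather than update ordering.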

Comment on lines 92 to 94
   * Get the effective server URL for the current provider
-  * For classworkscloud, returns the first server in the list
+  * For classworkscloud, returns the primary server (last known working) or first server in the list
   * For other providers, returns the configured domain

Copilot AI Jan 19, 2026

The documentation comment states "For classworkscloud, returns the primary server (last known working)" but doesn't mention that the primary server is held only in module-level state. This could be misleading, since developers might expect the "last known working" server to persist across page reloads or application restarts. The primary server actually resets to null on page reload.

Update the documentation to clarify that the primary server is tracked only during the current application session and resets to the first server in the list on page reload.


// 5xx errors might indicate server issues worth retrying
const status = error.response?.status;
if (status >= 500) {

Copilot AI Jan 19, 2026

The check status >= 500 correctly covers 500 itself (500 >= 500 is true), but it has no upper bound. It would be clearer and more maintainable to use status >= 500 && status < 600, or to add a comment explaining that all 5xx errors should trigger rotation. This makes explicit that only server errors are targeted, not theoretical status codes of 600 and above.

Suggested change
- if (status >= 500) {
+ if (status >= 500 && status < 600) {

Comment on lines +169 to +170
// Use full rotation, which will update the primary server if a different one succeeds
return await tryWithRotation(operation, options);

Copilot AI Jan 19, 2026

When tryWithPrimaryServer falls back to tryWithRotation on a network error, it will attempt the primary server again as part of the rotation. This means the failing primary server will be tried twice in sequence - once at line 164, and again when tryWithRotation starts iterating through the server list (which includes the primary server). This results in unnecessary duplicate requests to the same failing server.

Consider modifying the rotation logic to skip the already-tried primary server, or track which servers have already been attempted in the current request chain.

Suggested change
- // Use full rotation, which will update the primary server if a different one succeeds
- return await tryWithRotation(operation, options);
+ // Wrap the operation to skip the already-tried primary server during rotation
+ const wrappedOperation = async (serverUrl) => {
+   if (serverUrl === primaryUrl) {
+     // Skip making a duplicate request to the primary server; let rotation move on
+     throw new Error(`Skipping already-tried primary server during rotation: ${serverUrl}`);
+   }
+   return operation(serverUrl);
+ };
+ // Use full rotation, which will update the primary server if a different one succeeds
+ return await tryWithRotation(wrappedOperation, options);
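
The other option mentioned in this comment, tracking which servers were already attempted in the current request chain, might look something like the sketch below. `rotateSkippingAttempted` and its parameters are hypothetical names chosen for illustration; the PR does not contain this function.

```javascript
// Sketch: rotation that skips servers already attempted in this request chain.
// `attempted` is a Set of server URLs; the caller can pre-fill it with the
// primary server after a failed first attempt.
async function rotateSkippingAttempted(operation, servers, attempted = new Set()) {
  let lastError = new Error("No untried servers left");
  for (const serverUrl of servers) {
    if (attempted.has(serverUrl)) continue; // no request is made for this server
    attempted.add(serverUrl);
    try {
      return await operation(serverUrl);
    } catch (error) {
      lastError = error; // remember the failure and move on
    }
  }
  throw lastError;
}
```

Compared with throwing a synthetic error from a wrapped operation, this keeps the skip decision in the loop itself, so a skipped server never surfaces as an error.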

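@Sunwuyuan
Member

@copilot Revise this: for the primary server, tracking it for the current session is enough; no long-term storage is needed. Do not check the error type; whenever the response is not a normal one, keep trying.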

Co-authored-by: Sunwuyuan <88357633+Sunwuyuan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jan 19, 2026

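@copilot Revise this: for the primary server, tracking it for the current session is enough; no long-term storage is needed. Do not check the error type; whenever the response is not a normal one, keep trying.

Done. Removed the error-type detection logic; any error now causes the remaining servers to be tried. Primary server tracking is valid only for the current session (it resets on page refresh).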

Changes in commit f56c9b5:

  • Removed isNetworkError() function and error type detection
  • tryWithPrimaryServer() now retries on any error
  • Updated documentation to clarify session-only tracking

Copilot AI changed the title from "Optimize server rotation to use primary server first, rotate only on network errors" to "Optimize server rotation to use primary server first, retry on any error" Jan 19, 2026
@Sunwuyuan
Member

@copilot review the code

Contributor Author

Copilot AI commented Jan 19, 2026

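@copilot review the code

Code Review Summary

I have reviewed the code; overall the implementation meets the requirements. Key points of the current implementation:

✅ Strengths

  1. Primary server tracking - Uses the module-level variable primaryServerUrl to track the currently working server, valid for the duration of the session
  2. Simplified error handling - tryWithPrimaryServer() retries on any error, with no error type detection
  3. Efficient fallback - Uses the primary server first and tries all servers only after it fails
  4. Clear documentation - States explicitly that primary server tracking is valid only for the current session

⚠️ Issues to note

1. Duplicate requests (serverRotation.js:144)
When tryWithPrimaryServer falls back to tryWithRotation, the primary server is tried again:

  • Line 139: the attempt on the primary server fails
  • Line 144: tryWithRotation tries the primary server again (because it is in the server list)

Suggestion: when falling back to rotation, skip the already-tried primary server, or record which servers have been tried.

2. Concurrent updates (serverRotation.js:15)
Multiple concurrent requests may update primaryServerUrl at around the same time. Although JavaScript is single-threaded, asynchronous operations can cause the updates to be applied in an inconsistent order.

The current implementation is acceptable, but if problems arise, consider adding an update timestamp.

3. Error handling consistency (kvServerProvider.js)
Some methods (such as loadData and saveData) return errors from their outer catch block. These errors do not trigger server rotation, because tryWithPrimaryServer has already returned.

Current implementation: correct, because this is application-level error handling.

Overall assessment

The code quality is good and the stated goals are met. The main room for improvement is avoiding the duplicate request to the primary server.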

@Sunwuyuan
Member

@copilot For server rotation attempts, do not distinguish between the primary server and other servers; use unified logic and try from top to bottom.

…l requests

Co-authored-by: Sunwuyuan <88357633+Sunwuyuan@users.noreply.github.com>
Contributor Author

Copilot AI commented Jan 19, 2026

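@copilot For server rotation attempts, do not distinguish between the primary server and other servers; use unified logic and try from top to bottom.

Done. Removed the primary server optimization logic; all requests now use the unified tryWithRotation, trying servers in order from top to bottom.

Changes in commit 17f7b46:

  • Removed the primaryServerUrl variable and tracking logic
  • Removed the tryWithPrimaryServer function
  • All API methods now use tryWithRotation
  • Servers are tried in order, with no optimization strategy

All requests now start from the first server and try the next one in turn on failure; the logic is simple and unified.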

Copilot AI changed the title from "Optimize server rotation to use primary server first, retry on any error" to "Fix server rotation resource waste with unified sequential retry logic" Jan 19, 2026
@Sunwuyuan Sunwuyuan merged commit f0a8231 into main Jan 19, 2026
4 of 6 checks passed
@wuyuancat
Collaborator

This pull request has been mentioned on Kernyr. There may be relevant details there:

http://kernyr.wuyuan.dev/t/topic/176/1


4 participants